2. Investigating alignment interpretability for low-resource NMT
In: Machine Translation, Springer Verlag, 2021. ISSN: 0922-6567; EISSN: 1573-0573. DOI: 10.1007/s10590-020-09254-w. https://hal.archives-ouvertes.fr/hal-03139744
3. Is there a bilingual disadvantage for word segmentation? A computational modeling approach
In: Journal of Child Language, Cambridge University Press (CUP), 2021, pp. 1-28. ISSN: 0305-0009; EISSN: 1469-7602. DOI: 10.1017/S0305000921000568. https://hal.archives-ouvertes.fr/hal-03498905
4. SM to: Is there a bilingual disadvantage for word segmentation? A computational modeling approach ...
5. Early Tashelhiyt Berber word segmentation: the role of the Possible Word Constraint ...
6. Discovering structure in speech recordings: Unsupervised learning of word and phoneme like units for automatic speech recognition
In: Fraunhofer IAIS (2021)
Abstract:
While speech recordings are easy to obtain, transcribing them can be very costly and time-consuming. Automatic methods that derive such transcriptions from unlabeled data can therefore help simplify the training of speech recognizers for languages where little to no labeled training data is available. This thesis investigates and introduces methods to learn transcriptions automatically from audio recordings alone. It presents algorithms for the unsupervised learning of phonemes, the smallest units in speech, and of words; these methods can then be used to train a speech recognizer automatically from unlabeled data. The unsupervised learning methods are investigated separately for phonemes and for words. The main focus is on the unsupervised learning of words in hierarchical models consisting of phoneme and word transcriptions. Three main approaches are investigated. The first is heuristic methods. The second comprises two variants of statistical model-based approaches: one based on a probabilistic pronunciation lexicon, the other on word segmentation over lattices instead of a single best sequence. The third is a fully unsupervised system that combines unsupervised phoneme discovery with unsupervised word segmentation. The thesis concludes by integrating unsupervised phoneme and word discovery into a semantic inference task, set in a simple command and control interface, to demonstrate that phonemes and words learned without supervision are usable in upstream tasks and can improve performance over purely supervised methods.
Keywords:
Acoustic Unit Discovery; ASR; automatic speech recognition; unsupervised learning; Unsupervised Word Segmentation
URL: http://publica.fraunhofer.de/documents/N-644770.html https://doi.org/10.17619/UNIPB/1-1252
7. Handling cross and out-of-domain samples in Thai word segmentation
In: 1003 ; 1016 (2021)
8. Measuring (online) word segmentation in adults and children
In: Dutch Journal of Applied Linguistics, Vol 10 (2021)
9. Investigating Language Impact in Bilingual Approaches for Computational Language Documentation
In: Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020), LREC 2020, May 2020, Marseille, France. https://hal.archives-ouvertes.fr/hal-02895907
10. F0 Slope and Mean: Cues to Speech Segmentation in French
In: Interspeech 2020, Oct 2020, Shanghai, China, pp. 1610-1614. DOI: 10.21437/Interspeech.2020-2509. https://hal.archives-ouvertes.fr/hal-03042331
11. The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions ...
12. Data for: The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions ...
13. The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions ...
14. Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech
15. Infants Segment Words from Songs—An EEG Study
In: Brain Sciences, Volume 10, Issue 1 (2020)
16. Not all words are equally acquired: transitional probabilities and instructions affect the electrophysiological correlates of statistical learning
17. Controlling Utterance Length in NMT-based Word Segmentation with Attention
In: International Workshop on Spoken Language Translation, Nov 2019, Hong Kong, China. https://hal.archives-ouvertes.fr/hal-02343206
18. Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus
In: Open Mind, MIT Press, 2019, 3, pp. 13-22. EISSN: 2470-2986. DOI: 10.1162/opmi_a_00022. https://hal.archives-ouvertes.fr/hal-02274050
19. Unsupervised word discovery for computational language documentation ; Découverte non-supervisée de mots pour outiller la linguistique de terrain
In: Artificial Intelligence [cs.AI]. Université Paris Saclay (COmUE), 2019. English. NNT: 2019SACLS062. https://tel.archives-ouvertes.fr/tel-02286425
20. MiNgMatch—A Fast N-gram Model for Word Segmentation of the Ainu Language
In: Information, Volume 10, Issue 10 (2019)